Metric distances derived from cosine similarity and Pearson and Spearman correlations
Authors
Abstract
We investigate two classes of transformations of cosine similarity and Pearson and Spearman correlations into metric distances, utilising the simple tool of metric-preserving functions. The first class puts anti-correlated objects maximally far apart. Previously known transforms fall within this class. The second class collates correlated and anti-correlated objects. An example of such a transformation that yields a metric distance is the sine function when applied to centered data.
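The abstract gives no explicit formulas, so the following is only a minimal sketch of the two classes under stated assumptions. It assumes the widely used form sqrt(1 - r) as a representative of the first class (anti-correlated objects end up maximally far apart) and sqrt(1 - r^2), the sine of the angle between centered vectors, as a representative of the second class (correlated and anti-correlated objects end up close together). The function names `pearson`, `dist_anti_far`, `dist_sine`, and `triangle_holds` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: cosine similarity of the centered vectors."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    r = np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.clip(r, -1.0, 1.0))

def dist_anti_far(x, y):
    """Assumed first-class example: r = -1 gives the largest distance."""
    return float(np.sqrt(1.0 - pearson(x, y)))

def dist_sine(x, y):
    """Assumed second-class example: r = +1 and r = -1 both give ~0."""
    r = pearson(x, y)
    return float(np.sqrt(max(0.0, 1.0 - r * r)))

def triangle_holds(dist, x, y, z, tol=1e-7):
    """Numerically check d(x, z) <= d(x, y) + d(y, z) for one triple."""
    return dist(x, z) <= dist(x, y) + dist(y, z) + tol

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    triples = rng.normal(size=(1000, 3, 8))  # 1000 random triples of 8-dim vectors
    for dist in (dist_anti_far, dist_sine):
        ok = all(triangle_holds(dist, x, y, z) for x, y, z in triples)
        print(dist.__name__, "triangle inequality held on all samples:", ok)
```

A numerical check like this can only fail to find a counterexample; it does not prove the triangle inequality. For Spearman correlation, the same sketch would apply after replacing each vector by its ranks.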
Similar resources
Blind Assessment of Wavelet-Compressed Images Based On Subband Statistics of Natural Scenes
This paper presents a no-reference image quality assessment metric that makes use of the wavelet subband statistics to evaluate the levels of distortions of wavelet-compressed images. The work is based on the fact that for distorted images the correlation coefficients of the adjacent scale subbands change proportionally with respect to the distortion of a compressed image. Subband similarity is...
Document dissimilarity within and across languages: A benchmarking study
Quantifying the similarity or dissimilarity between documents is an important task in authorship attribution, information retrieval, plagiarism detection, text mining, and many other areas of linguistic computing. Numerous similarity indices have been devised and used, but relatively little attention has been paid to calibrating such indices against externally imposed standards, mainly because ...
Semantic Cosine Similarity
Cosine similarity is a widely implemented metric in information retrieval and related studies. This metric models a text as a vector of terms and the similarity between two texts is derived from cosine value between two texts' term vectors. Cosine similarity however still can't handle the semantic meaning of the text perfectly. This paper proposes an enhancement of cosine similarity measurement...
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
Biweight Correlation as a Measure of Distance between Genes on a Microarray
Motivation: The underlying goal of microarray experiments is to identify genetic patterns across different experimental conditions. Genes that are contained in a particular pathway or that respond similarly to experimental conditions should be co-expressed and show similar patterns of expression on a microarray. Using any of a variety of clustering methods or gene network analyses we can partit...
Journal title:
- CoRR
Volume abs/1208.3145 Issue
Pages -
Publication date 2012